{ "metadata": { "name": "" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "code", "collapsed": false, "input": [ "# The usual preamble\n", "%matplotlib inline\n", "\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "\n", "# Make the graphs a bit prettier, and bigger\n", "pd.set_option('display.mpl_style', 'default')\n", "plt.rcParams['figure.figsize'] = (15, 5)\n", "\n", "# This is necessary to show lots of columns in pandas 0.12. \n", "# Not necessary in pandas 0.13.\n", "pd.set_option('display.line_width', 5000) \n", "pd.set_option('display.max_columns', 60)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "line_width has been deprecated, use display.width instead (currently both are\n", "identical)\n", "\n" ] } ], "prompt_number": 1 }, { "cell_type": "markdown", "metadata": {}, "source": [ "One of the main problems with messy data is: how do you know if it's messy or not?\n", "\n", "We're going to use the NYC 311 service request dataset again here, since it's big and a bit unwieldy." ] }, { "cell_type": "code", "collapsed": false, "input": [ "requests = pd.read_csv('../data/311-service-requests.csv')" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stderr", "text": [ "/Users/admin/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/pandas/io/parsers.py:1159: DtypeWarning: Columns (8) have mixed types. Specify dtype option on import or set low_memory=False.\n", " data = self._reader.read(nrows)\n" ] } ], "prompt_number": 2 }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 7.1 How do we know if it's messy? " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We're going to look at a few columns here. 
I already know that there are some problems with the zip code column, so let's look at that first.\n", "\n", "To get a sense of whether a column has problems, I usually use `.unique()` to look at all its values. If it's a numeric column, I'll instead plot a histogram to get a sense of the distribution.\n", "\n", "When we look at the unique values in \"Incident Zip\", it quickly becomes clear that this is a mess.\n", "\n", "Some of the problems:\n", "\n", "* Some have been parsed as strings, and some as floats\n", "* There are `nan`s\n", "* Some of the zip codes are not plain five-digit codes, like `29616-0759` (ZIP+4 form) or `83`\n", "* There are some missing-value markers that pandas didn't recognize, like 'N/A' and 'NO CLUE'\n", "\n", "What we can do:\n", "\n", "* Normalize 'N/A' and 'NO CLUE' into regular nan values\n", "* Look at what's up with the 83, and decide what to do\n", "* Make everything strings" ] }, { "cell_type": "code", "collapsed": false, "input": [ "requests['Incident Zip'].unique()" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 47, "text": [ "array([11432.0, 11378.0, 10032.0, 10023.0, 10027.0, 11372.0, 11419.0,\n", " 11417.0, 10011.0, 11225.0, 11218.0, 10003.0, 10029.0, 10466.0,\n", " 11219.0, 10025.0, 10310.0, 11236.0, nan, 10033.0, 11216.0, 10016.0,\n", " 10305.0, 10312.0, 10026.0, 10309.0, 10036.0, 11433.0, 11235.0,\n", " 11213.0, 11379.0, 11101.0, 10014.0, 11231.0, 11234.0, 10457.0,\n", " 10459.0, 10465.0, 11207.0, 10002.0, 10034.0, 11233.0, 10453.0,\n", " 10456.0, 10469.0, 11374.0, 11221.0, 11421.0, 11215.0, 10007.0,\n", " 10019.0, 11205.0, 11418.0, 11369.0, 11249.0, 10005.0, 10009.0,\n", " 11211.0, 11412.0, 10458.0, 11229.0, 10065.0, 10030.0, 11222.0,\n", " 10024.0, 10013.0, 11420.0, 11365.0, 10012.0, 11214.0, 11212.0,\n", " 10022.0, 11232.0, 11040.0, 11226.0, 10281.0, 11102.0, 11208.0,\n", " 10001.0, 10472.0, 11414.0, 11223.0, 10040.0, 11220.0, 11373.0,\n", " 11203.0, 11691.0, 11356.0, 10017.0, 10452.0, 10280.0, 11217.0,\n", " 
10031.0, 11201.0, 11358.0, 10128.0, 11423.0, 10039.0, 10010.0,\n", " 11209.0, 10021.0, 10037.0, 11413.0, 11375.0, 11238.0, 10473.0,\n", " 11103.0, 11354.0, 11361.0, 11106.0, 11385.0, 10463.0, 10467.0,\n", " 11204.0, 11237.0, 11377.0, 11364.0, 11434.0, 11435.0, 11210.0,\n", " 11228.0, 11368.0, 11694.0, 10464.0, 11415.0, 10314.0, 10301.0,\n", " 10018.0, 10038.0, 11105.0, 11230.0, 10468.0, 11104.0, 10471.0,\n", " 11416.0, 10075.0, 11422.0, 11355.0, 10028.0, 10462.0, 10306.0,\n", " 10461.0, 11224.0, 11429.0, 10035.0, 11366.0, 11362.0, 11206.0,\n", " 10460.0, 10304.0, 11360.0, 11411.0, 10455.0, 10475.0, 10069.0,\n", " 10303.0, 10308.0, 10302.0, 11357.0, 10470.0, 11367.0, 11370.0,\n", " 10454.0, 10451.0, 11436.0, 11426.0, 10153.0, 11004.0, 11428.0,\n", " 11427.0, 11001.0, 11363.0, 10004.0, 10474.0, 11430.0, 10000.0,\n", " 10307.0, 11239.0, 10119.0, 10006.0, 10048.0, 11697.0, 11692.0,\n", " 11693.0, 10573.0, 83.0, 11559.0, 10020.0, 77056.0, 11776.0, 70711.0,\n", " 10282.0, 11109.0, 10044.0, '10452', '11233', '10468', '10310',\n", " '11105', '10462', '10029', '10301', '10457', '10467', '10469',\n", " '11225', '10035', '10031', '11226', '10454', '11221', '10025',\n", " '11229', '11235', '11422', '10472', '11208', '11102', '10032',\n", " '11216', '10473', '10463', '11213', '10040', '10302', '11231',\n", " '10470', '11204', '11104', '11212', '10466', '11416', '11214',\n", " '10009', '11692', '11385', '11423', '11201', '10024', '11435',\n", " '10312', '10030', '11106', '10033', '10303', '11215', '11222',\n", " '11354', '10016', '10034', '11420', '10304', '10019', '11237',\n", " '11249', '11230', '11372', '11207', '11378', '11419', '11361',\n", " '10011', '11357', '10012', '11358', '10003', '10002', '11374',\n", " '10007', '11234', '10065', '11369', '11434', '11205', '11206',\n", " '11415', '11236', '11218', '11413', '10458', '11101', '10306',\n", " '11355', '10023', '11368', '10314', '11421', '10010', '10018',\n", " '11223', '10455', '11377', '11433', '11375', '10037', 
'11209',\n", " '10459', '10128', '10014', '10282', '11373', '10451', '11238',\n", " '11211', '10038', '11694', '11203', '11691', '11232', '10305',\n", " '10021', '11228', '10036', '10001', '10017', '11217', '11219',\n", " '10308', '10465', '11379', '11414', '10460', '11417', '11220',\n", " '11366', '10027', '11370', '10309', '11412', '11356', '10456',\n", " '11432', '10022', '10013', '11367', '11040', '10026', '10475',\n", " '11210', '11364', '11426', '10471', '10119', '11224', '11418',\n", " '11429', '11365', '10461', '11239', '10039', '00083', '11411',\n", " '10075', '11004', '11360', '10453', '10028', '11430', '10307',\n", " '11103', '10004', '10069', '10005', '10474', '11428', '11436',\n", " '10020', '11001', '11362', '11693', '10464', '11427', '10044',\n", " '11363', '10006', '10000', '02061', '77092-2016', '10280', '11109',\n", " '14225', '55164-0737', '19711', '07306', '000000', 'NO CLUE',\n", " '90010', '10281', '11747', '23541', '11776', '11697', '11788',\n", " '07604', 10112.0, 11788.0, 11563.0, 11580.0, 7087.0, 11042.0,\n", " 7093.0, 11501.0, 92123.0, 0.0, 11575.0, 7109.0, 11797.0, '10803',\n", " '11716', '11722', '11549-3650', '10162', '92123', '23502', '11518',\n", " '07020', '08807', '11577', '07114', '11003', '07201', '11563',\n", " '61702', '10103', '29616-0759', '35209-3114', '11520', '11735',\n", " '10129', '11005', '41042', '11590', 6901.0, 7208.0, 11530.0,\n", " 13221.0, 10954.0, 11735.0, 10103.0, 7114.0, 11111.0, 10107.0], dtype=object)" ] } ], "prompt_number": 47 }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 7.3 Fixing the nan values and string/float confusion" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can pass a `na_values` option to `pd.read_csv` to clean this up a little bit. We can also specify that the type of Incident Zip is a string, not a float." 
] }, { "cell_type": "code", "collapsed": false, "input": [ "na_values = ['NO CLUE', 'N/A', '0']\n", "requests = pd.read_csv('../data/311-service-requests.csv', na_values=na_values, dtype={'Incident Zip': str})" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 3 }, { "cell_type": "code", "collapsed": false, "input": [ "requests['Incident Zip'].unique()" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 4, "text": [ "array(['11432', '11378', '10032', '10023', '10027', '11372', '11419',\n", " '11417', '10011', '11225', '11218', '10003', '10029', '10466',\n", " '11219', '10025', '10310', '11236', nan, '10033', '11216', '10016',\n", " '10305', '10312', '10026', '10309', '10036', '11433', '11235',\n", " '11213', '11379', '11101', '10014', '11231', '11234', '10457',\n", " '10459', '10465', '11207', '10002', '10034', '11233', '10453',\n", " '10456', '10469', '11374', '11221', '11421', '11215', '10007',\n", " '10019', '11205', '11418', '11369', '11249', '10005', '10009',\n", " '11211', '11412', '10458', '11229', '10065', '10030', '11222',\n", " '10024', '10013', '11420', '11365', '10012', '11214', '11212',\n", " '10022', '11232', '11040', '11226', '10281', '11102', '11208',\n", " '10001', '10472', '11414', '11223', '10040', '11220', '11373',\n", " '11203', '11691', '11356', '10017', '10452', '10280', '11217',\n", " '10031', '11201', '11358', '10128', '11423', '10039', '10010',\n", " '11209', '10021', '10037', '11413', '11375', '11238', '10473',\n", " '11103', '11354', '11361', '11106', '11385', '10463', '10467',\n", " '11204', '11237', '11377', '11364', '11434', '11435', '11210',\n", " '11228', '11368', '11694', '10464', '11415', '10314', '10301',\n", " '10018', '10038', '11105', '11230', '10468', '11104', '10471',\n", " '11416', '10075', '11422', '11355', '10028', '10462', '10306',\n", " '10461', '11224', '11429', '10035', '11366', '11362', '11206',\n", " '10460', '10304', '11360', 
'11411', '10455', '10475', '10069',\n", " '10303', '10308', '10302', '11357', '10470', '11367', '11370',\n", " '10454', '10451', '11436', '11426', '10153', '11004', '11428',\n", " '11427', '11001', '11363', '10004', '10474', '11430', '10000',\n", " '10307', '11239', '10119', '10006', '10048', '11697', '11692',\n", " '11693', '10573', '00083', '11559', '10020', '77056', '11776',\n", " '70711', '10282', '11109', '10044', '02061', '77092-2016', '14225',\n", " '55164-0737', '19711', '07306', '000000', '90010', '11747', '23541',\n", " '11788', '07604', '10112', '11563', '11580', '07087', '11042',\n", " '07093', '11501', '92123', '00000', '11575', '07109', '11797',\n", " '10803', '11716', '11722', '11549-3650', '10162', '23502', '11518',\n", " '07020', '08807', '11577', '07114', '11003', '07201', '61702',\n", " '10103', '29616-0759', '35209-3114', '11520', '11735', '10129',\n", " '11005', '41042', '11590', '06901', '07208', '11530', '13221',\n", " '10954', '11111', '10107'], dtype=object)" ] } ], "prompt_number": 4 }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 7.4 What's up with the dashes?" ] }, { "cell_type": "code", "collapsed": false, "input": [ "rows_with_dashes = requests['Incident Zip'].str.contains('-').fillna(False)\n", "len(requests[rows_with_dashes])" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 6, "text": [ "5" ] } ], "prompt_number": 6 }, { "cell_type": "code", "collapsed": false, "input": [ "requests[rows_with_dashes]" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
\n", "| | Unique Key | Created Date | Closed Date | Agency | Agency Name | Complaint Type | Descriptor | Location Type | Incident Zip | Incident Address | Street Name | Cross Street 1 | Cross Street 2 | Intersection Street 1 | Intersection Street 2 | Address Type | City | Landmark | Facility Type | Status | Due Date | Resolution Action Updated Date | Community Board | Borough | X Coordinate (State Plane) | Y Coordinate (State Plane) | Park Facility Name | Park Borough | School Name | School Number | School Region | School Code | School Phone Number | School Address | School City | School State | School Zip | School Not Found | School or Citywide Complaint | Vehicle Type | Taxi Company Borough | Taxi Pick Up Location | Bridge Highway Name | Bridge Highway Direction | Road Ramp | Bridge Highway Segment | Garage Lot Name | Ferry Direction | Ferry Terminal Name | Latitude | Longitude | Location |\n",
"|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|\n",
"| 29136 | 26550551 | 10/24/2013 06:16:34 PM | NaN | DCA | Department of Consumer Affairs | Consumer Complaint | False Advertising | NaN | 77092-2016 | 2700 EAST SELTICE WAY | EAST SELTICE WAY | NaN | NaN | NaN | NaN | NaN | HOUSTON | NaN | NaN | Assigned | 11/13/2013 11:15:20 AM | 10/29/2013 11:16:16 AM | 0 Unspecified | Unspecified | NaN | NaN | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | N | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |\n",
"| 30939 | 26548831 | 10/24/2013 09:35:10 AM | NaN | DCA | Department of Consumer Affairs | Consumer Complaint | Harassment | NaN | 55164-0737 | P.O. BOX 64437 | 64437 | NaN | NaN | NaN | NaN | NaN | ST. PAUL | NaN | NaN | Assigned | 11/13/2013 02:30:21 PM | 10/29/2013 02:31:06 PM | 0 Unspecified | Unspecified | NaN | NaN | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | N | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |\n",
"| 70539 | 26488417 | 10/15/2013 03:40:33 PM | NaN | TLC | Taxi and Limousine Commission | Taxi Complaint | Driver Complaint | Street | 11549-3650 | 365 HOFSTRA UNIVERSITY | HOFSTRA UNIVERSITY | NaN | NaN | NaN | NaN | NaN | HEMSTEAD | NaN | NaN | Assigned | 11/30/2013 01:20:33 PM | 10/16/2013 01:21:39 PM | 0 Unspecified | Unspecified | NaN | NaN | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | N | NaN | NaN | NaN | La Guardia Airport | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |\n",
"| 85821 | 26468296 | 10/10/2013 12:36:43 PM | 10/26/2013 01:07:07 AM | DCA | Department of Consumer Affairs | Consumer Complaint | Debt Not Owed | NaN | 29616-0759 | PO BOX 25759 | BOX 25759 | NaN | NaN | NaN | NaN | NaN | GREENVILLE | NaN | NaN | Closed | 10/26/2013 09:20:28 AM | 10/26/2013 01:07:07 AM | 0 Unspecified | Unspecified | NaN | NaN | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | N | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |\n",
"| 89304 | 26461137 | 10/09/2013 05:23:46 PM | 10/25/2013 01:06:41 AM | DCA | Department of Consumer Affairs | Consumer Complaint | Harassment | NaN | 35209-3114 | 600 BEACON PKWY | BEACON PKWY | NaN | NaN | NaN | NaN | NaN | BIRMINGHAM | NaN | NaN | Closed | 10/25/2013 02:43:42 PM | 10/25/2013 01:06:41 AM | 0 Unspecified | Unspecified | NaN | NaN | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | N | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |\n",
"\n",
"5 rows \u00d7 52 columns
\n", "\n",
"| | Unique Key | Created Date | Closed Date | Agency | Agency Name | Complaint Type | Descriptor | Location Type | Incident Zip | Incident Address | Street Name | Cross Street 1 | Cross Street 2 | Intersection Street 1 | Intersection Street 2 | Address Type | City | Landmark | Facility Type | Status | Due Date | Resolution Action Updated Date | Community Board | Borough | X Coordinate (State Plane) | Y Coordinate (State Plane) | Park Facility Name | Park Borough | School Name | School Number | School Region | School Code | School Phone Number | School Address | School City | School State | School Zip | School Not Found | School or Citywide Complaint | Vehicle Type | Taxi Company Borough | Taxi Pick Up Location | Bridge Highway Name | Bridge Highway Direction | Road Ramp | Bridge Highway Segment | Garage Lot Name | Ferry Direction | Ferry Terminal Name | Latitude | Longitude | Location |\n",
"|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|\n",
"| 42600 | 26529313 | 10/22/2013 02:51:06 PM | NaN | TLC | Taxi and Limousine Commission | Taxi Complaint | Driver Complaint | NaN | 00000 | EWR EWR | EWR | NaN | NaN | NaN | NaN | NaN | NEWARK | NaN | NaN | Assigned | 12/07/2013 09:53:51 AM | 10/23/2013 09:54:43 AM | 0 Unspecified | Unspecified | NaN | NaN | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | N | NaN | NaN | NaN | Other | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |\n",
"| 60843 | 26507389 | 10/17/2013 05:48:44 PM | NaN | TLC | Taxi and Limousine Commission | Taxi Complaint | Driver Complaint | Street | 00000 | 1 NEWARK AIRPORT | NEWARK AIRPORT | NaN | NaN | NaN | NaN | NaN | NEWARK | NaN | NaN | Assigned | 12/02/2013 11:59:46 AM | 10/18/2013 12:01:08 PM | 0 Unspecified | Unspecified | NaN | NaN | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | Unspecified | N | NaN | NaN | NaN | Other | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |\n",
"
\n", "| | Incident Zip | Descriptor | City |\n",
"|---|---|---|---|\n",
"| 71834 | 23502 | Harassment | NORFOLK |\n",
"| 47048 | 23541 | Harassment | NORFOLK |\n",
"| 85821 | 29616 | Debt Not Owed | GREENVILLE |\n",
"| 89304 | 35209 | Harassment | BIRMINGHAM |\n",
"| 94201 | 41042 | Harassment | FLORENCE |\n",
"| 30939 | 55164 | Harassment | ST. PAUL |\n",
"| 80573 | 61702 | Billing Dispute | BLOOMIGTON |\n",
"| 13450 | 70711 | Contract Dispute | CLIFTON |\n",
"| 12102 | 77056 | Debt Not Owed | HOUSTON |\n",
"| 29136 | 77092 | False Advertising | HOUSTON |\n",
"| 44008 | 90010 | Billing Dispute | LOS ANGELES |\n",
"| 71001 | 92123 | Billing Dispute | SAN DIEGO |\n",
"| 57636 | 92123 | Harassment | SAN DIEGO |\n",
"\n",
"13 rows \u00d7 3 columns
\n", "